Vector-Quantization-Based Topic Modeling
Abstract
With the purpose of learning and utilizing explicit and dense topic embeddings, we propose three variations of novel vector-quantization-based topic models (VQ-TMs): (1) Hard VQ-TM, (2) Soft VQ-TM, and (3) Multi-View VQ-TM. The model family capitalizes on vector quantization techniques, embedded input documents, and viewing words as mixtures of topics. Guided by a comprehensive set of evaluation metrics, we conduct systematic quantitative and qualitative empirical studies, and demonstrate the superior performance of VQ-TMs compared to important baseline models. Through a unique case study on code generation from natural language descriptions, we further illustrate the power of VQ-TMs in downstream tasks.
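The abstract contrasts hard and soft quantization and describes words as mixtures of topics. The following is a minimal sketch of that general distinction only, not the models from the paper: it assumes a small codebook of "topic" vectors, assigns a word embedding to its nearest entry (hard), and computes a softmax over negative squared distances (soft). The function names, the `temperature` parameter, and the toy vectors are all illustrative assumptions.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_assign(vector, codebook):
    """Hard quantization: index of the nearest codebook (topic) vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(vector, codebook[k]))


def soft_assign(vector, codebook, temperature=1.0):
    """Soft quantization: a probability mixture over codebook entries.

    Uses a softmax over negative squared distances; `temperature` is a
    hypothetical knob, not a parameter taken from the paper.
    """
    scores = [-squared_distance(vector, c) / temperature for c in codebook]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy data (assumed): three 2-D topic vectors and one word embedding.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
word = [0.9, 0.8]

hard_topic = hard_assign(word, codebook)     # a single topic index
topic_mixture = soft_assign(word, codebook)  # weights that sum to 1
```

In this sketch the hard assignment discards everything except the closest topic, while the soft assignment keeps a weight for every topic, which is one simple way to read "words as mixtures of topics".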
Similar resources
Parametric modeling of intonation using vector quantization
In this study we propose a data-based approach to intonation modeling using vector quantization. The model is based on an F0 parametrization with an especially designed approximation function. The parameter vectors found are vector quantized with varying codebook sizes. This method is motivated by intonation theories that suggest that pitch accent and boundary phenomena can be described by a di...
Vector Quantization Based Image Compression
An image compression method combining discrete wavelet transform (DWT) and vector quantization (VQ) is presented. First, a three-level DWT is performed on the original image resulting in ten separate sub bands. These sub bands are then vector quantized. VQ indices are Huffman coded to increase the compression ratio. Lloyd extended scalar quantization technique is used to design memory less vect...
Divergence based Learning Vector Quantization
We suggest the use of alternative distance measures for similarity based classification in Learning Vector Quantization. Divergences can be employed whenever the data consists of non-negative normalized features, which is the case for, e.g., spectral data or histograms. As examples, we derive gradient based training algorithms in the framework of Generalized Learning Vector Quantization based o...
Speaker Identification Based on Vector Quantization
In this paper a method of text-independent speaker recognition using discrete vector quantization is presented. The identification experiments were performed in a closed set of 599 speakers and two various types of features were tested: cepstral mean subtraction coefficients and mel-frequency cepstral coefficients. The effect of the various codebook size on the speaker identification performanc...
Content-Based Image Retrieval Via Vector Quantization
Image retrieval and image compression are each areas that have received considerable attention in the past. In this work, we present an approach for content-based image retrieval (CBIR) using vector quantization (VQ). Using VQ allows us to retain the image database in compressed form without any need to store additional features for image retrieval. The hope is that encoding an image with a cod...
Journal
Journal title: ACM Transactions on Intelligent Systems and Technology
Year: 2021
ISSN: 2157-6904, 2157-6912
DOI: https://doi.org/10.1145/3450946